Helicobacter pylori haemagglutinin protease protein, nucleic acid encoding therefor and antibodies specific thereto

ABSTRACT

Helicobacter pylori (H.pylori) haemagglutinin/protease protein, nucleic acids encoding therefor and antibodies specific thereto are described, in particular their use in the identification of H.pylori and in the diagnosis of H.pylori infection. Also described are kits for the identification and diagnosis of H.pylori infection.

The present invention relates to Helicobacter pylori (H.pylori) haemagglutinin/protease protein, nucleic acids encoding therefor and antibodies specific thereto and, in particular, to their use in the identification of H.pylori and in the diagnosis of H.pylori infection.

H.pylori (formerly Campylobacter pyloridis or C.pylori) is a spiral-shaped Gram negative microorganism which appears to live beneath the mucus layer of the stomach. Since its first isolation in 1982 H.pylori has been associated with gastric and duodenal ulcer disease and gastric cancer. H.pylori has been described as the most chronic infectious agent of man. Reviews on the state of the art include those by C. A. M. McNulty in J. Infection, 1986, 13, 107-113, C. S. Goodwin et al. in J. Clin. Pathol., 1986, 39, 353-365 and the Eurogast Study Group, Lancet, 1993, 341, 1359-1362.

The number of genes that encode proteins that are involved in the ability of H.pylori to cause disease is unknown and virulence determinants of H.pylori have so far not been identified. A number of determinants possessed by this organism have been proposed as possible pathogenic factors. For example, multiple flagella allow the microorganism to move rapidly by a corkscrew-like motion through highly viscous fluids such as the mucus layer of the gut which normally poses a barrier to bacteria en route to the gut epithelium. Also, the ability of H.pylori to produce mucinase digesting enzymes allows the organism to spread in the stomach. Microscopic studies of gastric biopsies from patients with H.pylori infection have shown the H.pylori organisms at specific sites on the gastric epithelial cells. Biochemical studies have reported the identification of haemagglutinins that allow H.pylori to adhere to these sites.

Extracellular metalloprotease enzymes are common microbial pathogenicity factors in bacteria causing disease in mammals. Zinc metalloprotease enzymes are known to have rapid substrate turnover and broad substrate profiles. Reviews on the state of the art are by Frausto da Silva J. J. R. and Williams R. J. P. in The Biological Chemistry of the Elements, 1993, Clarendon Press, Oxford, Chapter 11, and Vallee, B. L. and Auld D. S., Biochemistry, 29, 5647-5659.

The zinc metalloprotease enzyme of Pseudomonas aeruginosa (also known as the elastase enzyme) has been shown to be important in the lung tissue-destructive processes caused by this organism in cystic fibrosis patients (Bever R. A. and Iglewski B. H., J. Bacteriol., 1988, 170, 4309-4314). Similarly, the zinc metalloprotease enzyme of Vibrio cholerae (V.cholerae) (also known as the mucinase enzyme or haemagglutinin/protease (HAP) enzyme) has been shown to be important in the attachment and detachment of these organisms during the disease cholera. References on the state of the art include: Hase C. C. and Finkelstein R. A., J. Bacteriol., 1991, 173, 3311-3317; and Finkelstein R. A. et al., Infect. Immunol., 1992, 60, 472-478.

We have now surprisingly found that a virulence gene almost identical to the V.cholerae hap gene, which we have termed the H.pylori hap gene, is present in the H.pylori genome. The detection of the H.pylori hap nucleic acid sequence by polymerase chain reaction (PCR) or other hybridization methods, the detection of H.pylori HAP protein epitopes by antibody detection methods, or the detection of antibodies to the H.pylori HAP protein by, for example, ELISA, have utility in the diagnosis of H.pylori mediated gastroduodenal disorders in mammals.

Prior art in the diagnosis of H.pylori mediated gastric diseases includes Campylobacter DNA probes capable of hybridizing H.pylori RNA (EP-A-0350205). H.pylori oligonucleotides specific for the H.pylori urease gene sequences are disclosed in WO 91/09049. Serological detection and diagnosis of H.pylori infection by serological immunoassays and detection of H.pylori antigens and antigenic fragments are disclosed in WO 89/08843, WO 89/09497 and EP-A-0329570.

It would be highly desirable to have a reliable means of detecting H.pylori DNA, RNA or antibodies directed against specific H.pylori proteins or H.pylori proteins themselves in clinical samples from a patient (for example in the gastric mucosa, saliva, faecal samples, plasma or serum), as a means of early diagnosis of gastritis, gastric or peptic ulcerations, or of gastric cancer.

Accordingly, the present invention provides a nucleic acid sequence encoding the H.pylori HAP protein or a fragment thereof comprising all or part of the nucleic acid sequence of FIG. 1 known as Sequence No. 1.

The H.pylori hap gene of the present invention has been sequenced, giving rise to the possibility of constructing selected oligonucleotide or protein epitope sequences specific to H.pylori. The sequenced H.pylori hap gene has been found to be over 99% similar to the V.cholerae hap gene in the coding region. The nucleotide sequence of the cloned 1.5 kb H.pylori hap gene fragment showing the 1 kb of coding sequence and about 500 bp of 3' flanking sequence is shown in FIG. 1.

Preferably the nucleic acid sequence encodes at least one antigenic determinant of H.pylori HAP protein. Preferably the nucleic acid sequence comprises the sequences of bases numbered from 1 to 936 of FIG. 1. Also provided by the present invention is a nucleic acid sequence which is complementary to the H.pylori hap nucleic acid sequence as defined above. The nucleic acid sequence may comprise genomic DNA, complementary DNA (cDNA), synthetic DNA or recombinant DNA or RNA.

The nucleic acid sequence may comprise an oligonucleotide of from 15 to 50 nucleotides, preferably from 18 to 50 nucleotides, which has specific binding affinity for a portion of the nucleic acid sequence as shown in FIG. 1, or a nucleic acid sequence complementary thereto. The oligonucleotide may also be from 15 to 30 nucleotides, preferably from 15 to 25 nucleotides. Preferably the oligonucleotide comprises a sequence of at least 15 or more nucleotides and includes the nucleotides numbered 16 to 18 (inclusive), 220 to 222 (inclusive) or 43 to 45 (inclusive) as shown in FIG. 1.

Preferably the oligonucleotides are any of the following sequences:

5'- GCACAGGCAACAGGAACC-3' known as Sequence No. 2; or

5'- AACGAGGCCTGAATTCTGC-3' known as Sequence No. 3; or

5'- ATAACGTAGACCACCGGAGG-3' known as Sequence No. 5; or

5'- TCCGGTGGTATTAACGAAGC-3'.

The oligonucleotides may comprise DNA or RNA sequences.

The oligonucleotides within the scope of the present invention include both single- and double-stranded versions, it being understood that in any hybridization procedures such double-stranded probes will require denaturing to provide the probes in single-stranded form.
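As a minimal sketch of how an oligonucleotide can be checked against the cloned sequence, the following Python looks for an oligonucleotide, or for its reverse complement, within the first 180 bases of SEQ ID NO: 1 as printed in the sequence listing. The function names are hypothetical, and the positions reported follow the numbering of the sequence listing, which need not coincide with the numbering used in FIG. 1.

```python
# First 180 bases of SEQ ID NO: 1, copied from the sequence listing.
SEQ1_START = (
    "GAATTCAGGCCTCGTTTACCGAGATATGTCCGGTGGTATTAACGAAGCATTCTCGGATAT"
    "CGCAGGGGAAGCGGCAGAGTACTTTATGCGTGGCAATGTCGACTGGATTGTCGGCGCGGA"
    "TATTTTTAAATCCTCCGGTGGTCTACGTTATTTCGATCAGCCGTCACGTGATGGCCGCTC"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def locate(oligo: str, template: str):
    """Find an exact match of the oligo on either strand of the template.

    Returns (strand, start, end) with 1-based inclusive coordinates on the
    listed strand, or None when neither orientation matches exactly.
    """
    oligo = oligo.upper()
    index = template.find(oligo)
    if index >= 0:
        return ("+", index + 1, index + len(oligo))
    index = template.find(reverse_complement(oligo))
    if index >= 0:
        return ("-", index + 1, index + len(oligo))
    return None


if __name__ == "__main__":
    for oligo in ("TCCGGTGGTATTAACGAAGC", "ATAACGTAGACCACCGGAGG"):
        print(oligo, locate(oligo, SEQ1_START))
```

Only exact matches are reported; a practical probe-design check would also consider mismatches and the stringency of the hybridization conditions.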

The present invention further provides vectors which comprise any nucleic acid sequence as hereinbefore defined. The present invention specifically contemplates the provision of any vector system known in the art including cloning vectors such as pUC18 and pUC19 as well as expression vectors such as pASK60-Strep. Further provided by the present invention is a host cell transformed with one or more such vectors. The present invention also provides a process for the production of DNA sequences as hereinbefore defined comprising culturing a host cell which has been transformed with one or more vectors comprising the DNA sequence and isolating the DNA sequence therefrom. Such a process is carried out according to conditions and procedures well known in the art.

Further provided by the present invention is H.pylori HAP protein or a fragment thereof comprising all or a part of the amino acid sequence of FIG. 1 known as Sequence No. 6. Preferably the fragment of the HAP protein is an antigenic determinant of H.pylori HAP protein. Preferably the fragment of the HAP protein is encoded by the sequence of bases numbered from 1 to 936 of FIG. 1.

The H.pylori protein or fragment thereof within the scope of this invention includes the H.pylori HAP protein itself, being purified from H.pylori or being produced as a recombinant protein in, for example, Escherichia coli or Bacillus subtilis, it being understood that the subsequent diagnostic procedures such as the detection of the H.pylori protein in the gastric mucosa or other secretions or products of the gastrointestinal tract are within the scope of this invention.

The present invention also provides a process for the production of H.pylori HAP protein or a fragment thereof as hereinbefore described comprising culturing a host cell transformed with an expression vector as hereinbefore defined and isolating the protein or protein fragment produced therefrom.

It will be understood that in accordance with the present invention a defined nucleic acid sequence includes not only the identical nucleic acid sequence but also any minor base variations from the natural nucleic acid sequence including, in particular, substitutions in bases which result in a synonym codon (a different codon specifying the same amino acid). Furthermore a defined protein, polypeptide or amino acid sequence includes not only the identical amino acid sequence but also minor amino acid variations from the natural amino acid sequence including, in particular, conservative amino acid replacements (a replacement by an amino acid that is related in its side chains). Also included are amino acid sequences which vary from the natural amino acid sequence but result in a polypeptide which is immunologically identical with the polypeptide encoded by the naturally occurring sequence. This includes a correspondingly altered encoding nucleic acid sequence.

The present invention also provides polyclonal and monoclonal antibodies which recognize an antigenic determinant of H.pylori HAP protein or a fragment thereof. The H.pylori specific antibodies covered by this invention include both polyclonal and monoclonal antibodies directed against the H.pylori HAP protein or a fragment thereof, it being understood that the subsequent diagnostic procedures such as the detection of antibodies against H.pylori in the gastric mucosa or other secretions or products of the gastrointestinal tract are within the scope of this invention.

The polyclonal and monoclonal antibodies are produced by standard techniques known in the art; for example, monoclonal antibodies are produced by the techniques described in Kohler G. & Milstein C., (1975), Nature, 256, 495-497.

In accordance with the present invention, the specific H.pylori HAP protein is useful in raising antibodies to the H.pylori HAP protein in experimental mammals and these specific anti-H.pylori HAP antibodies are useful in detecting the H.pylori HAP protein in clinical samples from a patient. The H.pylori HAP protein is also useful in ELISA or Western blot analysis, and can be purified from H.pylori organisms or produced by recombinant DNA technology, the procedures for which are all well recognised and well within the capabilities of the person skilled in the art.

Further provided by the present invention are hybridomas capable of producing monoclonal antibodies which recognize an antigenic determinant of H.pylori HAP protein, or a fragment thereof.

The present invention also provides a process for the amplification of a nucleic acid sequence as hereinbefore described by polymerase chain reaction (PCR) or equivalent technique known in the art such as the transcription-aided amplification system (TAS). Preferably the PCR process is effected using the oligonucleotide pairs: ##STR1##

The present invention also provides nucleic acid probes comprising a nucleic acid sequence or part thereof as shown in FIG. 1. Also provided is a process for the identification of H.pylori nucleic acid comprising contacting a sample to be tested with a nucleic acid probe as hereinbefore described, under appropriate conditions known in the art, and detecting any hybridization of H.pylori nucleic acid sequence or sequences with the probe.

The selection of a particular probe, or pair of probes, will depend upon a number of factors, well understood in the art, and including amongst others the stringency requirements, i.e. the ability or otherwise of the probe to tolerate mismatching with the complementary sequence in the target DNA. Obviously the longer the probe the better the ability to withstand local mismatch without adversely affecting the hybridization of the probe to the target DNA. The factors affecting the choice of probe are well recognised and well within the capabilities of the person skilled in the art.

Also provided is a process for the identification of H.pylori nucleic acid comprising amplifying, in a sample to be tested, any H.pylori nucleic acid sequence as shown in FIG. 1 by PCR or equivalent technique and detecting the amplified nucleic acid sequences.

Although PCR amplification is the preferred method of H.pylori detection using H.pylori specific nucleotides of this invention, other detection procedures are available and are well known in the art. To this end the H.pylori specific oligonucleotides or nucleic acid probes of this invention may be provided with a variety of different labels such as radioactive, fluorescent or enzyme labels, all permitting the detection of any hybridized nucleotide bound to the unidentified nucleic acid sample under investigation.

Further provided is a method for the identification of H.pylori HAP protein antigenic determinants comprising contacting a sample to be tested with an antibody according to the present invention and detecting the presence of an antibody-antigen complex. Synonymously, there is also provided a method for the identification of H.pylori infection comprising contacting a sample to be tested with an H.pylori HAP protein or fragment thereof as hereinbefore described, or V.cholerae HAP protein or fragment thereof, and detecting the presence of an antigen-antibody complex. The sample to be tested will generally comprise dental plaque, saliva, gastric juices, or faeces or may comprise a sample of the gastric mucosa. As described above for the H.pylori hap nucleic acid sequences of the present invention, the amino acid sequences, polypeptides, protein and antibodies may be provided with a variety of different labels such as fluorescent, radioactive or enzymic labels, all permitting the detection of any amino acid sequence or antibody bound to the antibody or amino acid sequence sample under investigation, respectively.

H.pylori HAP antigens (the HAP protein or a fragment thereof) or V.cholerae HAP antigens can be used in immunoassays to detect patients who exhibit cross-reacting antibodies. Conversely, antibodies can be used in immunoassays to detect patients who exhibit cross-reacting H.pylori HAP antigens. Correlation can thus be made with H.pylori infection-associated gastroduodenal disease. The immunoassays contemplated by the present invention comprise diagnostic methods known in the art. The immunoassays may be based on direct antigen-antibody reactions, competition, single or double sandwich assays and include amplification systems such as those utilizing biotin and avidin.

The assays may comprise components attached to solid supports such as immunodiagnostic plates or glass beads or may involve immunoprecipitation. The immunoassays generally comprise use of a labelled antibody/antigen wherein the label may comprise fluorescent, chemiluminescent, radioactive or dye molecules.

Further, the present invention provides the use of the H.pylori hap nucleic acid, HAP amino acid, polypeptides, protein sequences and antibodies or V.cholerae hap nucleic acid or HAP protein or a fragment thereof for the manufacture of materials and kits for the diagnosis of gastric disorders associated with H.pylori. Accordingly kits are provided for by the present invention which comprise one or more of: a nucleic acid sequence encoding H.pylori HAP protein or a fragment thereof comprising all or part of the nucleic acid sequence of FIG. 1; H.pylori HAP protein or a fragment thereof comprising all or part of the amino acid sequence of FIG. 1; or V.cholerae HAP protein or a fragment thereof; or a polyclonal or monoclonal antibody which recognizes an antigenic determinant of the H.pylori HAP protein or a fragment thereof. There is also provided a kit for the identification of H.pylori using the polymerase chain reaction or equivalent technique comprising at least one of the pairs of oligonucleotides as hereinbefore described.

Kits according to the present invention may include appropriately labelled reagents, additional reagents and materials such as buffer solutions, means for detecting results of the assay and assay instructions. The kit components may be packaged in a suitable kit-container.

The present invention also provides the use of the following in the diagnosis of H.pylori infection: a nucleic acid sequence encoding H.pylori HAP protein or a fragment thereof comprising all or part of the nucleic acid sequence of FIG. 1; H.pylori HAP protein or a fragment thereof comprising all or part of the amino acid sequence of FIG. 1; or V.cholerae HAP protein or a fragment thereof; or a polyclonal or monoclonal antibody which recognizes an antigenic determinant of H.pylori HAP protein or a fragment thereof.

Thus, the use of the H.pylori hap nucleic acid sequence or fragment thereof or protein sequence or fragment thereof or antibodies directed against the H.pylori HAP protein covered by this invention includes the manufacture of a kit or other materials for use in the diagnosis of gastric disorders associated with H.pylori. The H.pylori hap nucleic acid sequence or fragment thereof or protein sequences or fragment thereof or anti-H.pylori HAP antibodies may be combined with one or more other nucleic acid sequence or fragment thereof or protein sequence or fragment thereof or antibodies used in the diagnosis of gastric disorders associated with H.pylori.

The present invention is further described with reference to the following drawings in which:

FIG. 1 is a nucleotide sequence of the cloned 1.5 kb H.pylori hap gene fragment from H.pylori NCTC 11638 showing about 1 kb of coding sequence and about 500 bp of 3' flanking sequence. The sequence of this region of the H.pylori hap gene has been submitted to the EMBL Nucleotide Sequence Data Library under the accession number Z27239.

FIG. 2 is a map of the 1.5 kb fragment of the H.pylori hap gene in the plasmid pUC19. The H.pylori hap gene fragment is situated between the two BamHI sites.

In FIG. 1, the numbering is from the first adenosine base (A) of the EcoRI site. The PCR primers are shown as dotted overlined arrows at positions 206-225 (HpHAP3), and 762-780 (HpHAP4). The three base in-frame deletion (marked Δ H) from the V.cholerae hap gene is at position 222, and the region identical to the V.cholerae hap gene extends from position 1 beyond the coding region (double underlined) except for a single addition at position 798 and a single deletion (of a T) at position 843. The stop codon is indicated by * * *.

Within the identified H.pylori hap nucleic acid sequence two regions have been identified as comprising coding regions for the active site component of the protein. The two regions are around the nucleotides numbered 16-18 (inclusive) encoding tyrosine and 220-222 (inclusive) encoding histidine. The region around the nucleotides numbered 43-45 (inclusive) encoding glutamic acid has been identified as a putative zinc binding encoding region.
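The codon assignments above can be checked against the sequence listing with a short Python sketch. It assumes, as an inference from the statement that FIG. 1 is numbered from the first adenosine of the EcoRI site, that FIG. 1 position 1 corresponds to position 2 of SEQ ID NO: 1 (the first A of GAATTC). The constant and function names are hypothetical, and the standard genetic code is used.

```python
# First 240 bases of SEQ ID NO: 1, copied from the sequence listing.
SEQ1_START = (
    "GAATTCAGGCCTCGTTTACCGAGATATGTCCGGTGGTATTAACGAAGCATTCTCGGATAT"
    "CGCAGGGGAAGCGGCAGAGTACTTTATGCGTGGCAATGTCGACTGGATTGTCGGCGCGGA"
    "TATTTTTAAATCCTCCGGTGGTCTACGTTATTTCGATCAGCCGTCACGTGATGGCCGCTC"
    "GATAGATCATGCTTCACAGTATTACAGCGGTATTGATGTTCATTCGAGTGGCGTGTTTAA"
)

# Assumption: FIG. 1 position n is sequence-listing position n + 1.
FIG1_OFFSET = 1

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def fig1_slice(first: int, last: int) -> str:
    """Return the bases at FIG. 1 positions first..last (1-based, inclusive)."""
    start = first - 1 + FIG1_OFFSET  # 0-based index into the listing
    end = last + FIG1_OFFSET         # exclusive end index
    return SEQ1_START[start:end]


if __name__ == "__main__":
    for first, last in ((16, 18), (43, 45), (220, 222)):
        codon = fig1_slice(first, last)
        print(f"FIG. 1 {first}-{last}: {codon} -> {CODON_TABLE[codon]}")
```

Under this offset the three codons read TAC, GAA and CAT, which the standard code translates as tyrosine, glutamic acid and histidine, in agreement with the assignments stated above.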

Within the sequence in FIG. 1 certain regions of the gene have been identified that contain sequences that do not cross-hybridize with Campylobacter DNA. These include the nucleotides numbered 1-174 (inclusive).

The present invention will now be described in more detail with reference to the following examples:

EXAMPLE 1 Growth of H.pylori Strains and Extraction of DNA and Proteins

H.pylori strains NCTC (National Collection of Type Cultures, London) 11637, 11638, 11916 and HP 34 (clinical isolate from a biopsy taken during endoscopy of a patient at Queen's Medical Centre, The University Hospital, Nottingham) were grown on Columbia blood agar with 5% horse blood (Oxoid) as a lawn for DNA extraction, for 2 days under microaerophilic conditions (Campypak, BBL) at 37° C. The resulting growth was harvested from four plates and was first Gram stained to identify the characteristic morphology. The cells were washed in 1 ml of lysis buffer (50 mM EDTA, 100 mM NaCl), and resuspended in 400 μl of lysis buffer to which 30 μl of lysis buffer containing 20% N-lauroylsarcosine (Sigma) was added. After five minutes incubation at room temperature, the suspension was repeatedly extracted with phenol saturated with TE (10 mM Tris-HCl, pH 8, 1 mM EDTA) buffer until no interface was evident. The nucleic acids were then ethanol precipitated overnight, collected by centrifugation, washed in 70% ethanol and the pellet air dried for 10 minutes. The DNA was then treated with proteinase K and purified using a Qiagen minicolumn according to the manufacturer's instructions. H.pylori strain NCTC 11638 was grown in liquid culture by adding one harvested plate of culture to 100 ml of Brucella broth (Difco) containing 2% β-cyclodextrin (Sigma) and 0.2 ml of reconstituted H.pylori selective supplement (Oxoid) in a 500 ml conical flask. The flask was incubated in an anaerobic jar with gentle shaking (100 r.p.m.) for 3 days under microaerophilic conditions (Campypak, BBL) at 37° C. The H.pylori cells were collected by centrifugation and the proteins in the supernatant precipitated as previously described in Milton D. L., Norqvist, A., and Wolf-Watz, H., (1992), J. Bacteriol., 174, 7235-7244.

Cellular proteins were extracted by resuspending two plates of growth in 1.5 ml of protein-extraction buffer (10 mM Tris pH 7.5, 1 mM MgCl₂, 0.15 mM EDTA, 1 mM DTT, 1 mM PMSF, 2 μg ml⁻¹ pepstatin A (Sigma), 0.5 μg ml⁻¹ leupeptin (Sigma)), adding 200 μl of 10% SDS and boiling for 5 minutes. The suspension was then cooled on ice for 5 minutes and the supernatant collected by centrifugation at 12500 r.p.m. for five minutes. The protein concentration was determined according to Bradford, M., (1976), Anal. Biochem., 72, 248-252, (Bio-Rad kit).

EXAMPLE 2 Identification of a H.pylori Protease Enzyme

40 μg of total cell and supernatant proteins from H.pylori NCTC 11638 and a clinical isolate of P.aeruginosa were separated on vertical minigels (Hoeffer Scientific), comprising a 5% acrylamide stacking gel and a 13% resolving gel, according to the procedure of Laemmli, U. K., (1970), Nature, 227, 680-685. Electrophoresis was performed at 20 mA. Half of the gel was stained directly and the other half was incubated before staining to reveal protease activity. The gel portion that was stained directly was placed in a solution of 0.1% Coomassie brilliant blue R-250 in destain solution (40% methanol, 10% acetic acid, 50% water) for 1 h, and then placed in several changes of destain solution. The unstained gel was incubated, and the protease activity of this gel was detected by the procedures described in Milton et al., 1992, J. Bacteriol., 174, 7235-7244, except that the gel was finally stained with Coomassie brilliant blue R-250 as above rather than amido black. Protease activity was identified in the gel by cleared bands (digestion of the gelatin and other proteins). Protease activity was clearly present in the H.pylori cell, supernatant and P.aeruginosa tracks. In the H.pylori tracks numerous proteolytic bands were present. An overloaded gel (100 μg of total H.pylori protein) run according to the above procedure showed a clear protease band at about 35 kDa.

The electrophoretic separation was transferred on to a nitrocellulose membrane by semi-dry blotting using transfer buffer (40 mM Tris pH 8, 30 mM glycine, 20% methanol, 1.3 mM SDS) and an ATTO AE-6675 Horizblot transfer unit (Genetic Research International) according to the manufacturer's instructions. The nitrocellulose membrane was then dried and placed in blocking solution (1% bovine serum albumin in wash buffer (10 mM Tris pH 7.5, 100 mM NaCl, 0.1% Tween 20)) for 1 hour at room temperature with constant rocking. The membrane was probed with either rabbit anti-P.aeruginosa elastase (Bever and Iglewski, 1988, J. Bacteriol., 170, 4309-4314) or pooled human sera absorbed with E.coli according to techniques well known in the art. The primary antibodies were added to the blocking buffer at 1:1000 and incubation was continued for 1 to 4 hours. The membrane was then briefly washed twice with wash buffer, once for 15 minutes and once for 5 minutes with rocking. The HRP-labelled antibody (either anti-rabbit or anti-human) was added to the membrane at a concentration of 1:1000 in wash buffer and incubated for 1 hour as above. The membrane was then washed once for 15 minutes, and four times for 5 minutes with wash buffer as above. The membrane was then developed using ECL substrate reagents (Amersham) and exposed to Fuji RX X-ray film and developed according to the manufacturer's instructions. Re-probing of blots was performed after stripping by incubating in 20 mM glycine pH 2.5, 0.055% Tween 20 overnight at room temperature with continuous shaking.

After probing with HRP labelled anti-rabbit antibodies, a strong band was visible in the P.aeruginosa track and a band of similar molecular weight to the P.aeruginosa elastase was seen in the H.pylori cellular protein extracts. After probing with the pooled sera from five patients with high-titre antibodies against H.pylori, a large number of bands were observed including a strong response to the band (35 kDa) previously identified by the anti-P.aeruginosa elastase antibody. After probing with pooled sera from five patients not infected with H.pylori no bands were visible. Thus, a protein of similar size and immunological reactivity to the P.aeruginosa elastase protein was shown to be present in cellular and supernatant protein extracts of H.pylori NCTC 11638.

EXAMPLE 3 Cloning of the H.pylori hap Gene

5 μg of HindIII and BamHI digested H.pylori NCTC 11638 genomic DNA were separated and blotted onto Hybond N (Amersham) as described in Smith et al. (1992), Gene, 114, 211-216. The genomic blots were probed overnight with 200 ng of a 3.2 kb HindIII fragment of V.cholerae hap gene DNA and 5 ng of HindIII-digested phage lambda DNA directly labelled with HRP (Amersham ECL kit), washed, developed and exposed to Fuji RX X-ray film according to the manufacturer's instructions. A 4 kb HindIII and a 1.5 kb BamHI fragment of H.pylori DNA hybridized strongly to the V.cholerae hap gene probe.

Both the 4 kb (HindIII) and the 1.5 kb (BamHI) fragments were cloned from H.pylori NCTC 11638 genomic DNA by the following method:

Two 10 μg portions of genomic DNA were digested with either BamHI or HindIII and size separated on a 0.5×TBE minigel. The 4 kb HindIII fragment and the 1.5 kb BamHI fragment were extracted from the gel and ligated separately into either HindIII- or BamHI-digested (respectively) and calf alkaline phosphatase-treated pUC18, pUC19 (FIG. 2) and pAT153 as described in Smith, A. W., (1990), Ph.D. Thesis, University of Nottingham, Nottingham, United Kingdom. The resultant plasmids were transformed into E.coli strain DH5α and plated out onto L broth agar plates containing 100 mg/l ampicillin and incubated overnight at 37° C. The desired recombinants were identified by colony lifts onto Hybond N and screened by colony hybridization using the 3.2 kb HindIII V.cholerae (ECL) probe prepared above according to the manufacturer's instructions. The 1.5 kb BamHI fragment was sequenced in both directions using custom made, universal and reverse sequencing primers and Sequenase Version 2 (United States Biochemical Corporation) according to the manufacturer's instructions.

Sequencing of the 1.5 kb fragment did not reveal a start codon but did reveal a region of about 1 kb that was over 99% identical to the V.cholerae hap gene sequence, with only a three-base in-frame deletion of a histidine residue in the coding region (ΔH in FIG. 1). The sequences diverge completely about 50 bp downstream of the stop codon, with the other bases differing before the complete divergence. The cloned sequence is then quite different from the 3' flanking region of the V.cholerae hap locus. Such coding sequence conservation is highly unusual and difficult to explain either by a common precursor gene or by intergeneric gene transfer. The %G+C content for H.pylori is 34-37% while for V.cholerae it is 46-48%; the resulting difference in codon usage between the two genera should have allowed the DNA sequences to diverge even if the amino acid sequences were still conserved.
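The %G+C figure used in this comparison is simply the fraction of G and C bases in a sequence. A minimal Python sketch follows; the function name is hypothetical, and the 60-base input is only the opening stretch of SEQ ID NO: 1, so its value illustrates the calculation and is not an estimate of genomic G+C content.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    return 100.0 * gc_count / len(seq)


if __name__ == "__main__":
    # First 60 bases of SEQ ID NO: 1, copied from the sequence listing.
    fragment = "GAATTCAGGCCTCGTTTACCGAGATATGTCCGGTGGTATTAACGAAGCATTCTCGGATAT"
    print(f"%G+C of the fragment: {gc_percent(fragment):.1f}")
```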

EXAMPLE 4 PCR of H.pylori Genomic DNA

50 ng of H.pylori genomic DNA from the four strains NCTC 11638, NCTC 11637, NCTC 11916 and HP34 were amplified using DynaZyme (Flowgen) thermostable polymerase according to the manufacturer's instructions. The primers used were either specific to the H.pylori hap gene, 5'-GCACAGGCAACAGGAACC-3' and 5'-AACGAGGCCTGAATTCTGC-3', or those previously published to amplify a 411 bp fragment in the H.pylori ureA gene (Clayton et al., 1992, J. Clin. Microbiol., 30, 192-200 and Lopez et al., 1993, Mol. Cell Probes, 7, 439-446).

The reaction cycle profiles were as follows: one cycle at 95° C. for 5 min; thirty cycles at 94° C. for 30 sec, either 52° C. (hap) or 48° C. (ureA) for 1 min, 72° C. for 1 min 30 sec; and one cycle at 72° C. for 5 min. The PCR products were separated on 1.5% agarose Tris acetate gels, stained with ethidium bromide, blotted and hybridized as described in Example 2.
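The cycle profile can be written down as a small data structure, which is convenient for comparing the hap and ureA programmes. The following Python is a sketch only: the class and function names are hypothetical, and the total counts programmed hold times, ignoring the ramp times of any particular thermal cycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


def cycle_profile(anneal_c: int):
    """Build the profile as a list of (repeats, steps) stages.

    anneal_c is the annealing temperature: 52 for the hap primers,
    48 for the ureA primers.
    """
    return [
        (1, [Step(95, 300)]),                                    # initial denaturation, 5 min
        (30, [Step(94, 30), Step(anneal_c, 60), Step(72, 90)]),  # denature, anneal, extend
        (1, [Step(72, 300)]),                                    # final extension, 5 min
    ]


def total_hold_seconds(profile) -> int:
    """Sum the programmed hold times over all stages and repeats."""
    return sum(repeats * sum(step.seconds for step in steps)
               for repeats, steps in profile)


if __name__ == "__main__":
    for target, anneal in (("hap", 52), ("ureA", 48)):
        minutes = total_hold_seconds(cycle_profile(anneal)) / 60
        print(f"{target}: {minutes:.0f} min of programmed hold time")
```

Both programmes differ only in the annealing temperature, so their programmed hold times are the same.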

All four strains gave the 575 bp product predicted from the cloned hap gene sequence, although polymorphisms similar to those seen in the Southern blot analysis of strain HP34 were evident. Similarly, all four strains gave the 411 bp product from the ureA gene, thereby confirming the origin of the genomic DNA. Genomic DNA from the nine Helicobacter species H.acinonyx, H.felis, H.fennelliae, H.canis, H.muridarum, H.nemestrinae, H.mustelae, H.cinaedi and H.pylori NCTC 11638 was amplified using the HAP gene primers and all nine gave a 575 bp fragment. Some polymorphisms were observed, notably with H.acinonyx, H.felis, H.canis and H.nemestrinae, which were also evident from Southern analysis of genomic DNA from these species. This separation was blotted and probed with the V.cholerae hap gene and, of the PCR products produced from H.pylori genomic DNA, only the 575 bp hap fragment, and not the 411 bp (ureA) fragment, hybridized. The 575 bp PCR products from all nine Helicobacter species were shown to hybridize with the V.cholerae hap gene probe.
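The predicted product size follows from the primer coordinates given in the description of FIG. 1, where the primers are placed at positions 206-225 (HpHAP3) and 762-780 (HpHAP4). A minimal Python sketch of the arithmetic, with a hypothetical function name and assuming that these two primers are the hap pair used in this example:

```python
def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length of a PCR product spanning two primers.

    forward_start is the first position of the forward primer and
    reverse_end the last position covered by the reverse primer,
    both 1-based on the same strand numbering.
    """
    if reverse_end < forward_start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_end - forward_start + 1


if __name__ == "__main__":
    # Coordinates taken from the description of FIG. 1.
    print(amplicon_length(206, 780))
```

The product runs from position 206 to position 780 inclusive, that is 780 - 206 + 1 = 575 bp, matching the size observed on the gels.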

Due to the deletion between the H.pylori hap gene and the V.cholerae hap gene, PCR using non proof reading enzymes would result in no amplification of any homologous V.cholerae hap gene sequence present.

    __________________________________________________________________________
    SEQUENCE LISTING

    (1) GENERAL INFORMATION:
        (iii) NUMBER OF SEQUENCES: 6

    (2) INFORMATION FOR SEQ ID NO: 1:
        (i) SEQUENCE CHARACTERISTICS:
            (A) LENGTH: 1461 base pairs
            (B) TYPE: nucleic acid
            (C) STRANDEDNESS: double
            (D) TOPOLOGY: linear
        (ii) MOLECULE TYPE: DNA (genomic)
        (iii) HYPOTHETICAL: NO
        (iii) ANTI-SENSE: NO
        (vi) ORIGINAL SOURCE:
            (A) ORGANISM: HELICOBACTER PYLORI
            (B) STRAIN: H.PYLORI NCTC 11638
        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

    GAATTCAGGCCTCGTTTACCGAGATATGTCCGGTGGTATTAACGAAGCATTCTCGGATAT    60
    CGCAGGGGAAGCGGCAGAGTACTTTATGCGTGGCAATGTCGACTGGATTGTCGGCGCGGA   120
    TATTTTTAAATCCTCCGGTGGTCTACGTTATTTCGATCAGCCGTCACGTGATGGCCGCTC   180
    GATAGATCATGCTTCACAGTATTACAGCGGTATTGATGTTCATTCGAGTGGCGTGTTTAA   240
    CCGCGCGTTTTACCTACTCGCCAATAAATCGGGTTGGAACGTACGTAAAGGTTTTGAAGT   300
    GTTTGCCGTGGCTAACCAGTTGTACTGGACACCGAACAGCACGTTTGATCAAGGTGGCTG   360
    TGGGGTAGTGAAAGCGGCGCAGGATCTCAACTACAACACCGCAGACGTTGTGGCAGCCTT   420
    TAATACCGTGGGTGTCAATGCTTCTTGTGGCACCACGCCACCACCTGTCGGCAAAGTGCT   480
    TGAGAAAGGTAAACCGATCACAGGACTGAGCGGCTCACGTGGAGGAGAAGATTTCTATAC   540
    CTTTACGGTGACCAATTCAGGCAGTGTTGTTGTGTCCATCAGTGGTGGAACGGGCGATGC   600
    GGATCTGTATGTCAAAGCGGGCAGCAAACCCACCACCTCTTCTTGGGATTGTCGTCCATA   660
    CCGTTCAGGCAATGCCGAGCAGTGTTCCATCTCTGCGGTCGTGGGTACGACATACCATGT   720
    CATGTTACGCGGTTACAGTAACTATTCTGGTGTGACGTTACGCTTGGACTAACTTCCTTG   780
    CCACCTACCTGCAACGCCCTCAGCAAAGTCTGAGGGCGTTGTTTTTGAAGGGCAGTTTCT   840
    AGGATGTATCAACTATTTGAGTTGGCTGACCGCCGAAGAAACATTTTCTGCACCTTGGTA   900
    AATCTGTTCCATGATGGTTGACACTTCCACAATGCGGCCATTGGTTTCATTGGCAATGTG   960
    AGAGACTTGAGAAATAGAGCTGGTTACCGCTTCAGTTAAGGAGAGGTTTTTATTCACCAC  1020
    TTGATTGATCTCTTCTGTGGCTTTTGAGGTGCGAGAACGCAGTTGACCGACTTCATCGGC  1080
    AAACCACCGCAAAACCGCGTCCTTGATCAACCGCTGCGACGCTCTATCGCTGCATTAATG  1140
    CGGAGCAGATGGTTTGGTCAGCGATACACTGATGGTTTTGACAATTTCAGAGACATCTTT  1200
    CGAGAGAACCACGAGCTGCTCAATCTGTTGTAGTGATTGTTCGATATTGCCCACCATTTT  1260
    TTCTGCTAACGACTGAATCTTGCAGTACATGTTCCGCCTTGTGCTACTTGCGAGGTTCTA  1320
    CTGAGGTGCTATAGGCGATATTGGCAGCGTCTGTTACTTGCTGTTCACGTAATACCTCTG  1380
    CAGTGATGTCTGAGGCGAACTTGACAATTTTATATACCTTATTGTTTTGATCTTTGACCG  1440
    GACTGTAGGAGGCTTGGATCC                                         1461

    (2) INFORMATION FOR SEQ ID NO: 2:
        (i) SEQUENCE CHARACTERISTICS:
            (A) LENGTH: 18 bases
            (B) TYPE: nucleic acid
            (C) STRANDEDNESS: single
            (D) TOPOLOGY: linear
        (ii) MOLECULE TYPE: synthetic
        (iii) HYPOTHETICAL: NO
        (iv) ANTI-SENSE: NO
        (v) ORIGINAL SOURCE:
            (A) ORGANISM: HELICOBACTER PYLORI
            (B) STRAIN: H.PYLORI NCTC 11638
        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

    GCACAGGCAACAGGAACC                                              18

    (2) INFORMATION FOR SEQ ID NO: 3:
        (i) SEQUENCE CHARACTERISTICS:
            (A) LENGTH: 19 bases
            (B) TYPE: nucleic acid
            (C) STRANDEDNESS: single
            (D) TOPOLOGY: linear
        (ii) MOLECULE TYPE: synthetic
        (iii) HYPOTHETICAL: NO
        (iv) ANTI-SENSE: YES
        (v) ORIGINAL SOURCE:
            (A) ORGANISM: HELICOBACTER PYLORI
            (B) STRAIN: H.PYLORI NCTC 11638
        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

    AACGAGGCCTGAATTCTGC                                             19

    (2) INFORMATION FOR SEQ ID NO: 4:
        (i) SEQUENCE CHARACTERISTICS:
            (A) LENGTH: 20 bases
            (B) TYPE: nucleic acid
            (C) STRANDEDNESS: single
            (D) TOPOLOGY: linear
        (ii) MOLECULE TYPE: synthetic
        (iii) HYPOTHETICAL: NO
        (iv) ANTI-SENSE: YES
        (v) ORIGINAL SOURCE:
            (A) ORGANISM: HELICOBACTER PYLORI
            (B) STRAIN: H.PYLORI NCTC 11638
        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

    ATAACGTAGACCACCGGAGG                                            20

    (2) INFORMATION FOR SEQ ID NO: 5:
        (i) SEQUENCE CHARACTERISTICS:
            (A) LENGTH: 20 bases
            (B) TYPE: nucleic acid
            (C) STRANDEDNESS: single
            (D) TOPOLOGY: linear
        (ii) MOLECULE TYPE: synthetic
        (iii) HYPOTHETICAL: NO
        (iv) ANTI-SENSE: NO
        (v) ORIGINAL SOURCE:
            (A) ORGANISM: HELICOBACTER PYLORI
            (B) STRAIN: H.PYLORI NCTC 11638
        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

    TCCGGTGGTATTAACGAAGC                                            20

    (2) INFORMATION FOR SEQ ID NO: 6:
        (i) SEQUENCE CHARACTERISTICS:
            (A) LENGTH: 256 amino acids
            (B) TYPE: amino acid
            (C) STRANDEDNESS: single
            (D) TOPOLOGY: unknown
        (ii) MOLECULE TYPE: protein
        (iii) HYPOTHETICAL: NO
        (v) FRAGMENT TYPE: internal
        (vi) ORIGINAL SOURCE:
            (A) ORGANISM: HELICOBACTER PYLORI
            (B) STRAIN: H.PYLORI NCTC 11638
        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

    Asn Ser Gly Leu Val Tyr Arg Asp Met Ser Gly Gly Ile Asn Glu Ala
    1               5                   10                  15
    Phe Ser Asp Ile Ala Gly Glu Ala Ala Glu Tyr Phe Met Arg Gly Asn
                20                  25                  30
    Val Asp Trp Ile Val Gly Ala Asp Ile Phe Lys Ser Ser Gly Gly Leu
            35                  40                  45
    Arg Tyr Phe Asp Gln Pro Ser Arg Asp Gly Arg Ser Ile Asp His Ala
        50                  55                  60
    Ser Gln Tyr Tyr Ser Gly Ile Asp Val His Ser Ser Gly Val Phe Asn
    65                  70                  75                  80
    Arg Ala Phe Tyr Leu Leu Ala Asn Lys Ser Gly Trp Asn Val Arg Lys
                    85                  90                  95
    Gly Phe Glu Val Phe Ala Val Ala Asn Gln Leu Tyr Trp Thr Pro Asn
                100                 105                 110
    Ser Thr Phe Asp Gln Gly Gly Cys Gly Val Val Lys Ala Ala Gln Asp
            115                 120                 125
    Leu Asn Tyr Asn Thr Ala Asp Val Val Ala Ala Phe Asn Thr Val Gly
        130                 135                 140
    Val Asn Ala Ser Cys Gly Thr Thr Pro Pro Pro Val Gly Lys Val Leu
    145                 150                 155                 160
    Glu Lys Gly Lys Pro Ile Thr Gly Leu Ser Gly Ser Arg Gly Gly Glu
                    165                 170                 175
    Asp Phe Tyr Thr Phe Thr Val Thr Asn Ser Gly Ser Val Val Val Ser
                180                 185                 190
    Ile Ser Gly Gly Thr Gly Asp Ala Asp Leu Tyr Val Lys Ala Gly Ser
            195                 200                 205
    Lys Pro Thr Thr Ser Ser Trp Asp Cys Arg Pro Tyr Arg Ser Gly Asn
        210                 215                 220
    Ala Glu Gln Cys Phe Ile Ser Ala Val Val Gly Thr Thr Tyr His Val
    225                 230                 235                 240
    Met Leu Arg Gly Tyr Ser Asn Tyr Ser Gly Val Thr Leu Arg Leu Asp
                    245                 250                 255
    __________________________________________________________________________
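
By way of illustration only, the correspondence between SEQ ID NO: 1 and SEQ ID NO: 6 can be checked computationally. The Python sketch below translates two short excerpts of SEQ ID NO: 1 (nucleotides 1-60 and 721-780, copied from the sequence listing) in the reading frame that begins at nucleotide 2. It assumes the standard genetic code, and all function and variable names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical and the standard
# genetic code is assumed.

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes; "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

# Excerpts of SEQ ID NO: 1 as given in the sequence listing.
SEQ1_NT_1_60 = "GAATTCAGGCCTCGTTTACCGAGATATGTCCGGTGGTATTAACGAAGCATTCTCGGATAT"
SEQ1_NT_721_780 = "CATGTTACGCGGTTACAGTAACTATTCTGGTGTGACGTTACGCTTGGACTAACTTCCTTG"


def translate(dna, offset):
    """Translate complete codons of dna from a 0-based offset.

    Translation stops after the first stop codon, which is reported as "Ter".
    """
    residues = []
    for pos in range(offset, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[pos:pos + 3]]
        residues.append(THREE_LETTER[amino])
        if amino == "*":
            break
    return residues


# Nucleotides 2 and 722 both sit at 0-based offset 1 of their excerpt.
head = translate(SEQ1_NT_1_60, 1)
tail = translate(SEQ1_NT_721_780, 1)
print(" ".join(head))
print(" ".join(tail))
```

In this frame the first excerpt gives the opening residues of SEQ ID NO: 6, and the second excerpt ends with Asp followed by a TAA stop codon at nucleotides 770-772. Both results are consistent with the 256-residue length stated for SEQ ID NO: 6.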

We claim:
 1. An isolated nucleic acid molecule encoding H. pylori HAP protein comprising a nucleic acid molecule which encodes the amino acid sequence of Sequence I.D. No. 6.
 2. The nucleic acid molecule as claimed in claim 1, comprising a nucleotide sequence (a) selected from the group consisting of nucleotides 1-936 of SEQ ID NO: 1 and 1-769 of SEQ ID NO: 1, and (b) complements of (a).
 3. A nucleic acid molecule as claimed in claim 1 which is selected from the group consisting of genomic DNA, cDNA, and synthetic DNA.
 4. A vector comprising a nucleic acid molecule as claimed in claim 3.
 5. A vector as claimed in claim 4 which is pUC18 or pUC19.
 6. A vector as claimed in claim 4 which is an expression vector.
 7. A vector as claimed in claim 6 which is the expression vector pASK60-Strep.
 8. A host cell transformed with a vector as claimed in claim 4.
 9. A process for determining the expression of an H. pylori hap nucleic acid molecule comprising amplifying the H. pylori hap nucleic acid molecule using an oligonucleotide pair selected from the group consisting of (a) Sequence I.D. No. 2 and Sequence I.D. No. 3; and (b) Sequence I.D. No. 4 and Sequence I.D. No. 5.
 10. A nucleic acid probe comprising a nucleic acid sequence comprising a nucleotide sequence selected from the group consisting of nucleotides 206-225, 762-780, 1-936 and 1-769 of SEQ ID NO: 1, or a complementary sequence thereto.
 11. A method for the identification of an H. pylori nucleic acid molecule comprising contacting a sample to be tested with a nucleic acid probe selected from the group consisting of a probe comprising 15-50 contiguous nucleotides of Sequence I.D. No. 1 or its complement and a probe as claimed in claim 10, under appropriate conditions and detecting hybridization of the H. pylori nucleic acid molecule with the nucleic acid probe.
 12. A method for the identification of H. pylori in a sample comprising amplifying an H. pylori nucleic acid molecule in the sample using an oligonucleotide pair selected from the group consisting of (a) Sequence I.D. No. 2 and Sequence I.D. No. 3; and (b) Sequence I.D. No. 4 and Sequence I.D. No. 5 and detecting the amplified nucleic acid molecule.
 13. A kit for the identification of an H. pylori nucleic acid molecule comprising a nucleic acid molecule as claimed in claim 1.
 14. A kit for the identification of H. pylori hap gene nucleic acid molecule by nucleic acid amplification comprising a pair of oligonucleotide primers comprising two non-overlapping contiguous fragments of SEQ ID NO: 1 or its complement.
 15. A nucleic acid molecule encoding a fragment of H. pylori HAP protein comprising a nucleotide sequence selected from the group consisting of nucleotides 206-225, 762-780, 1-936, and 1-769 of Sequence I.D. No. 1.
 16. An oligonucleotide of from 18-50 nucleotides of Sequence I.D. No. 1 selected from the group consisting of a nucleic acid molecule comprising the nucleotide sequence of Sequence I.D. No. 4 and a nucleic acid comprising the nucleotide sequence of Sequence I.D. No. 5.
 17. A nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, wherein one of the codons encoding an amino acid is replaced with a stop codon.
 18. A nucleic acid probe comprising a nucleic acid sequence comprising a nucleotide sequence of an oligonucleotide as claimed in claim 16.
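
For illustration only, the positions at which the oligonucleotides of Sequence I.D. No. 4 and Sequence I.D. No. 5 anneal to Sequence I.D. No. 1 may be located by a simple string search. The Python sketch below is not part of the claimed subject matter. It uses only nucleotides 1-180 of Sequence I.D. No. 1 as given in the sequence listing, treats Sequence I.D. No. 5 as the sense primer and Sequence I.D. No. 4 as the anti-sense primer (as indicated in the listing), and assumes exact matching. The helper names `reverse_complement` and `locate_amplicon` are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical and exact
# primer/template matching is assumed.

# Nucleotides 1-180 of Sequence I.D. No. 1, from the sequence listing.
SEQ1_NT_1_180 = (
    "GAATTCAGGCCTCGTTTACCGAGATATGTCCGGTGGTATTAACGAAGCATTCTCGGATAT"
    "CGCAGGGGAAGCGGCAGAGTACTTTATGCGTGGCAATGTCGACTGGATTGTCGGCGCGGA"
    "TATTTTTAAATCCTCCGGTGGTCTACGTTATTTCGATCAGCCGTCACGTGATGGCCGCTC"
)
PRIMER_SEQ_ID_5 = "TCCGGTGGTATTAACGAAGC"  # sense primer
PRIMER_SEQ_ID_4 = "ATAACGTAGACCACCGGAGG"  # anti-sense primer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_amplicon(template, sense, antisense):
    """Return (start, end, length) of the predicted product, 1-based inclusive.

    Returns None if either primer has no exact match on the template.
    """
    start = template.find(sense)
    if start < 0:
        return None
    # The anti-sense primer anneals where its reverse complement occurs
    # on the sense strand, downstream of the sense primer.
    site = reverse_complement(antisense)
    site_start = template.find(site, start)
    if site_start < 0:
        return None
    end = site_start + len(site)
    return start + 1, end, end - start


print(locate_amplicon(SEQ1_NT_1_180, PRIMER_SEQ_ID_5, PRIMER_SEQ_ID_4))
```

On the listed sequence this places Sequence I.D. No. 5 at nucleotides 29-48 and the complement of Sequence I.D. No. 4 at nucleotides 132-151, for a predicted product of 123 base pairs. That figure is derived from the sequence listing alone and is not an experimental result.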